Vietnamese scene text detection based on modified Mask R-CNN
Yate FENG, Yimin WEN
Journal of Computer Applications    2021, 41 (12): 3551-3557.   DOI: 10.11772/j.issn.1001-9081.2021050821

To address the scarcity of training data for Vietnamese scene text detection and the incomplete detection of Vietnamese tone marks, a text detection algorithm for Vietnamese scenes is proposed, based on a modified version of the instance segmentation method Mask R-CNN. To segment Vietnamese scene text with tone marks accurately, only the P2 feature layer is used to segment the text area, and the mask matrix of the text area is resized from 14 × 14 to 14 × 28 to better fit the shape of most text instances. Because the conventional Non-Maximum Suppression (NMS) algorithm cannot eliminate duplicate text detection boxes, a filter module for text areas, named the Text region filtering branch, is designed and added after the detection module to eliminate them. The network is trained with a joint training method in two parts. In the first part, the Feature Pyramid Network (FPN) and the Region Proposal Network (RPN) of the model are trained on large-scale open Latin text data to improve the model's ability to generalize to text in different scenes. In the second part, the candidate box coordinate regression module and the segmentation module, named the Box branch and the Mask branch, are trained on pixel-level labelled Vietnamese scene text data so that the model can segment Vietnamese text areas including tone marks. Cross-validation and comparison experiments show that the proposed algorithm achieves better precision and recall than Mask R-CNN under different Intersection over Union (IoU) thresholds.
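
As a rough illustration of the IoU and NMS terms used in the abstract, the following minimal Python sketch implements box IoU and standard greedy NMS. It is not the paper's code. The box coordinates, scores, and the 0.5 threshold are made-up values, and the scenario (a fragment box whose overlap with the full-word box stays below the threshold) is only one assumed way in which a duplicate detection could survive NMS; the abstract does not state the cause.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if
    its IoU with every already-kept box is at most the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections for a single word (illustrative values only).
boxes = [
    (0.0, 0.0, 100.0, 40.0),  # full word
    (2.0, 0.0, 100.0, 40.0),  # near-duplicate of the full word
    (0.0, 0.0, 40.0, 40.0),   # fragment covering part of the same word
]
scores = [0.9, 0.8, 0.7]

kept = nms(boxes, scores, iou_threshold=0.5)
print(kept)
```

In this toy case the near-duplicate is suppressed, but the fragment has an IoU of about 0.4 with the full-word box and is kept, which is the kind of leftover box that an additional filtering step after detection would need to handle.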
